Data wrangling with the Tidyverse

Hezi Buba & Irene Steves
31 October 2018

Projects in R

Why use projects?

Code is usually meant to be run more than once. Projects help to maintain good workflow habits:

  • Using fresh R processes - each project comes with its own environment; restart R regularly so that scripts run in a clean R environment. Relying on variables left over in your environment can lead to problems later on.
  • Portability - projects have clearly defined base directories; use relative paths to keep projects self-contained and portable – your collaborators (including future-you!) will thank you
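As a rough sketch of the portability point, the snippet below reads and writes a file through a path relative to the project's base directory rather than a hard-coded absolute path. The folder and file names (`my_project`, `data`, `fish.csv`) and the toy data are made up for illustration, and a temporary directory stands in for the project root.

```r
# A temporary directory stands in for the project's base directory (assumption).
project_dir <- file.path(tempdir(), "my_project")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# Inside an RStudio project the working directory is already the project root;
# it is set by hand here only to imitate that.
old_wd <- setwd(project_dir)

# Relative path: resolves on any computer that has the project folder.
csv_path <- file.path("data", "fish.csv")  # hypothetical file name

# Toy data, invented for this example.
fish <- data.frame(species = c("grouper", "wrasse"), count = c(3, 12))
write.csv(fish, csv_path, row.names = FALSE)

# Read it back using the same relative path.
fish_back <- read.csv(csv_path)

# Restore the original working directory.
setwd(old_wd)
```

Because `csv_path` never mentions a user name or drive letter, the same script should run unchanged for a collaborator who opens the project on another machine.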

Extra reading: workflow versus script

GitHub

GitHub is a code storage/sharing platform. Use it to:

  • browse source code (CRAN, tidyverse, rOpenSci, r-lib)
  • use packages not on CRAN (devtools::install_github("account_name/repo_name"))
  • share/store analyses and functions
  • browse past versions of code

Cloning GitHub repositories

If you don't have git installed: download the repository's files to your local computer by clicking Download ZIP on its GitHub page.

If you have git installed and set up in RStudio (excellent instructions here), you can do the following:

(1) FORK the repository (make a copy of it in your github.com account)

(2) In RStudio: File --> New Project --> Version Control --> Git. The repository URL should be in the form: https://github.com/YOUR_ACCOUNT/REPO_NAME.
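For reference, the repository URL from step (2) can also be assembled in R, and the clone itself can be run from the console instead of the RStudio menus. This is only a sketch: `YOUR_ACCOUNT` and `REPO_NAME` are the placeholders used above, and the `git clone` call assumes git is available on your PATH.

```r
# Placeholders from the text above; replace with your own account and repository.
account <- "YOUR_ACCOUNT"
repo    <- "REPO_NAME"

# Build the repository URL in the form RStudio asks for.
repo_url <- sprintf("https://github.com/%s/%s", account, repo)

# A console alternative to the menu route (assumes git is on your PATH).
# Left commented out because the placeholder repository does not exist:
# system2("git", c("clone", repo_url))
```

Cloning this way only downloads the files; the menu route in step (2) also sets the clone up as an RStudio project.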